In complex manufacturing, a considerable amount of resources is focused on training 
workers and developing new skills. Increasing the effectiveness of those processes and 
reducing the investment required is an outstanding issue. In this paper, we present an 
experiment (n = 20) that shows how modern metaphors such as collaborative mixed 
reality can be used to transmit procedural knowledge and could eventually replace 
other forms of face-to-face training. We implemented a mixed reality setup with see- 
through cameras attached to a Head-Mounted Display. The setup allowed for real-time 
collaborative interactions and simulated conventional forms of training. We tested the 
system implementing a manufacturing procedure of an aircraft maintenance door. The 
obtained results indicate that performance levels in the immersive mixed reality training 
were not significantly different than in the conventional face-to-face training condition. 
These results and their implications for future training and the use of virtual reality, mixed 
reality, and augmented reality paradigms in this context are discussed in this paper. 


Keywords: mixed reality, immersive augmented reality, training, manufacturing, head-mounted displays 


INTRODUCTION 


Modern mass assembly lines for high value manufacturing are either robotized or rely heavily on 
skilled workers. Nevertheless, training new workers in complex tasks is an outstanding challenge 
for the industry (Mital et al., 1999). On the one hand, it involves having to dedicate limited physical 
equipment and professionals to instruct new personnel (Bal, 2012). On the other hand, the 
operation of dangerous equipment can raise health and safety concerns (Sun and Tsai, 2012). In 
this context, the use of novel technologies to train future workers on the processes could both 
increase the safety and reduce the training costs, which would eventually translate into an increase 
in productivity. 

Up to now, several computer-based approaches have been proposed as alternative methods 
for reducing the impact of these hurdles in industrial training. Previous work includes the use of 
Virtual Environments which allow users to practice and rehearse situations that might otherwise be 
dangerous in a real environment (Williams-Bell et al., 2015). Despite some controversy with respect 
to the efficient transfer to real-life setups of the skills trained in Virtual Environments (VE) (Kozak 
et al., 1993), these approaches have been successfully used for training in a variety of disciplines 
including health and safety (Dickinson et al., 2011; Kang and Jain, 2011), medical training (Bartoli 
et al., 2012; Gonzalez-Franco et al., 2014a,b), fire services (Williams-Bell et al., 2015), and industrial 
training (Muratet et al., 2011). In the context of industrial setups, several studies have examined 
the effects of virtual training (Oliveira et al., 2000; Stone, 2001; 
Lin et al., 2002), finding that this type of training significantly 
improved users’ skills in equivalent real scenarios particularly 
when they reproduced a face-to-face Virtual Physical System 
(Webel et al., 2013; Bharath and Rajashekar, 2015). 

However, most computer-based training systems are of low 
fidelity, to the extent that they are not realistic enough to completely 
replace conventional face-to-face training in complex manufacturing. 
This is partially due to the fact that in real life workers 
have access to physical equipment which they manipulate on 
demand, whereas computer-based training requires a digital 
version to be created. Using Augmented Reality (AR), workers 
can achieve higher levels of fidelity to make digital training more 
tangible. In fact, a prior study (Gavish et al., 2015) compared 
the use of AR in training for manufacturing and maintenance 
scenarios to video instructions and non-immersive computer 
training. The authors found that AR groups tended to perform 
better after training when compared to the groups that were 
only shown a video with the instructions. However, they did not 
find significant differences between computer training and AR 
groups, and they argued that a ceiling effect was likely the cause. 
Additionally, this study did not compare the performance of real 
face-to-face training to the performance of AR training. Other 
authors have explored the advantages of AR in the guidance of 
an assembly process (Yuan et al., 2008), with results indicating 
that AR is an effective method to improve performance. This 
is consistent with compiled reviews on the state of art in AR 
applications (Ong et al., 2008), since several AR-based training 
scenarios have been developed (Webel et al., 2013). In many AR 
applications, the user needs to hold a device in their hands to 
experience the augmentation. In this context, head-mounted 
devices are the only ones that can provide a hands-free experi- 
ence and potentially a better face-to-face Virtual Physical System 
(Webel et al., 2013). 

Face-to-face interaction is indeed a prominent characteristic 
of assembly training that seems to play a great role in learning 
(Lipponen, 2002). To achieve those levels of immersive interac- 
tion capable of providing better face-to-face training, we turn to 
Mixed Reality (MR) and Virtual Reality (VR), where it has been 
shown that objects can be manipulated naturally and from a first 
person perspective when the participant's position and movements 
are tracked (Chen and Sun, 2002; Spanlang et al., 2014). 

Indeed, Immersive VR applications are especially powerful 
when participants experience the Presence illusion: the feeling 
of actually “being there” inside the simulation. Presence has 
been described by a combination of two factors: the plausibility 
of the events happening being real and the place illusion, the 
sensation of being transported to a new location (Sanchez-Vives 
and Slater, 2005; Slater et al., 2009). These illusions, especially 
when combined, can produce realistic behaviors from partici- 
pants (Meehan et al., 2002). In this context, VR has successfully 
reproduced classical moral dilemmas to find out how people 
react without compromising their integrity (Slater et al., 2006; 
Friedman et al., 2014). Similarly, these realistic behaviors can 
also influence training, and several authors have already used 
VR as a tool for training and rehearsal in medical situations 
(Seymour et al., 2002; von Websky et al., 2013), disaster relief 
training (Farra et al., 2013), and other skill trainings related to 
motor control (Kishore et al., 2014; Padrao et al., 2016). However, 
while VR may be an excellent approach for isolated training, it 
is increasingly complex to use for collaborative training or face- 
to-face setups (Churchill and Snowdon, 1998; Monahan et al., 
2008; Bourdin et al., 2013; Gonzalez-Franco et al., 2015). In such 
scenarios, systems require several computers, complex network 
synchronization, and labor-intensive application development. 
Furthermore, aspects of self-representation and virtual body 
tracking become of major importance (Spanlang et al., 2014), 
as to collaborate and communicate in face-to-face scenarios we 
usually turn to body language (Garau et al., 2001). 

One approach to overcome the self-representation issue and 
simplify the tracking systems is to use mixed reality paradigms, 
in which calibrated see-through Head-Mounted Displays (HMDs) enable 
the exploration of digital objects from a first person perspective 
but also allow users to see the real setup with collocated real objects and 
people (Steptoe et al., 2014; Thorn et al., 2016). This paradigm 
is particularly interesting for collaborative scenarios where both 
instructor and trainee are together in the same space, and not 
remotely located. With this technology, participants can see the 
instructor guiding them through the process, but without the 
possible physical harm of the real operation. Additionally, a high 
degree of presence and a hands-free experience are guaranteed. 

In this paper, we test whether an MR setup could work 
for complex manufacturing training and we compare the results 
to conventional face-to-face training done on a physical scaled 
model. 


MATERIALS AND METHODS 


Apparatus 

We built a mixed reality setup by modifying an Oculus Rift DK1 
HMD with a 1,280 x 800 resolution (640 x 800 per eye), a 110° 
diagonal field of view (FOV) and approximately 90° horizontal 
FOV. A pair of cameras were mounted to the HMD to form a 
see-through mixed reality setup as in Steptoe et al. (2014) and 
Thorn et al. (2016). The scenario was implemented in Unity 
3D, and the head tracking was performed with a NaturalPoint 
Motive motion capture system (24x Flex 13 cameras) running 
at 120 Hz and streaming the head’s position and rotation to 
our application with centimetric precision. With this informa- 
tion, we could display the virtual objects from a first person 
perspective providing strong sensorimotor contingencies as the 
participant moved his/her head (Gonzalez-Franco et al., 2010; 
Spanlang et al., 2014). Using the same camera capture system, 
objects in the real world with attached reflective markers were 
tracked and corresponding spatial coordinates were calculated 
to render 3D objects. The 2D feed of the camera was rendered 
into planes in the background of the HMD and was calibrated 
to match the 3D spatial axis using the camera lenses and HMD 
specifications (Steptoe et al., 2014; Thorn et al., 2016). The optical 
distortion of the camera lenses was also corrected in real time with a 
shader using the camera calibrations (Zhang, 2000). Although the 
frame rate of the cameras was lower than that of the 
HMD (~45 and 60 Hz, respectively), the system was operative 
in real time with minimal perceptual lag. Indeed, none of the 
participants reported simulator sickness when operating the tech- 
nology. To interact with the virtual jig, we attached a rigid body 
reflective marker to an Ipow z07-5 stick; this way, participants 
could view a virtual object matching the position of the marker 
and press the button to interact with the virtual jig (see Video S1 
in Supplementary Material). This MR system allowed multi-user 
collaborations where different participants could interact with 
each other through a PhotonServer installed in the laboratory 
(Figure 1, see Video S1 in Supplementary Material). 
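
The lens correction step mentioned above can be illustrated with a minimal sketch. The paper does not give the shader code or the calibration values, so the following assumes a two-coefficient radial distortion model of the kind commonly estimated with Zhang-style calibration; the function names and the coefficients `k1` and `k2` are hypothetical and chosen only for illustration.

```python
def distort(x, y, k1, k2):
    """Map ideal normalized image coordinates to radially distorted ones."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * scale, y * scale


def undistort(xd, yd, k1, k2, iterations=20):
    """Invert the radial model by fixed-point iteration.

    Starts from the distorted point and repeatedly divides by the
    distortion factor evaluated at the current estimate.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return x, y


if __name__ == "__main__":
    # Hypothetical coefficients, not taken from the paper.
    k1, k2 = -0.2, 0.05
    xd, yd = distort(0.3, -0.4, k1, k2)
    print(undistort(xd, yd, k1, k2))
```

In a real-time shader the forward mapping is typically the one evaluated per output pixel, since each corrected pixel only needs to know where to sample the distorted camera image; the iterative inverse is shown here only to check that the two functions are consistent.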

For the conventional face-to-face training condition, we 
manufactured a laser-cut physical model of the jig in transparent 
plastic (see Video S1 in Supplementary Material). 


Participants 
Twenty-four volunteers (age mean = 32.5, SD = 9.6 years, three 
females) participated in the user study. Due to the confidential 
nature of the manufacturing content, this study was conducted 
using only employees from the institution. Participants who 
volunteered for the study did not have previous manufacturing 
knowledge and were asked to complete a demographic questionnaire 
before participating. Following the Declaration of Helsinki, 
all participants gave informed consent. This study was approved 
by the Science and Engineering Research Ethics Committee 
(SEREC) of Cranfield University. 


FIGURE 1 | Mixed reality setup. (A) Trainer's view, see-through with the virtual assembly jig. (B) Laboratory equipped with 24 motion capture cameras, and two 
participants wearing the mixed reality setups set for collaboration: the trainer is carrying the interaction wand while the second person observes the operation. 
(C) Participant's view. The interaction wand is represented by the green actuator. 

FIGURE 2 | Experimental design and procedure. 


Procedure 

We reproduced an aircraft maintenance door training manual 
in our MR setup. Through the proposed training, new operators 
are expected to achieve a reasonable level of knowledge of the 
assembly procedure before they are exposed to the real physical 
manufacturing equipment. Manufacturing of civil aircraft is 
subject to strict procedures due to the legal and safety implications 
of non-conformities. In this context, the ultimate goal of 
the training is to reduce the Cost of Non-Conformance (CONC) 
via a more interactive and cost-effective approach to minimize 
product defects or deviations from the design during production. 
The experiment implemented two different training conditions: 
(i) conventional face-to-face training, where participants were 
taught in a traditional face-to-face scenario manipulating a scaled 
assembly jig; and (ii) MR training. In the MR, participants were 
taught in a face-to-face scenario with a see-through HMD. This 
approach facilitated collaboration over a rendered digital model of 
the assembly jig and enabled virtual interactions when necessary 
for the training. This setup also implemented the manipulations 
and interactions with the jig necessary for the training. In both 
conditions, participants underwent the same procedural script 
obtained from a complex manufacturing manual of an aircraft 
maintenance door. Participants were then evaluated to assess how 
much knowledge they captured during the training (Figure 2). 
The training process was complex enough that it was not feasible 
to complete the tests successfully without previous training, but 
still procedural enough that, with a single training exposure, 
participants could complete the task and tests. 

Participants were randomly assigned to one of the two 
experimental conditions in a between-subjects design and underwent 
the following phases after completing the demographic 
questionnaire: 


Training 

The trainer performed the inspection and operated the moving 
parts of a door assembly jig following the manual. During this 
phase, the trainee had to observe what the trainer was doing and 
try to remember as much as possible for the evaluation phase. 


Evaluation 

After the training, the trainee was asked to complete two tests 
(a knowledge retention and a knowledge interpretation test) to 
compare both types of training. The knowledge retention test was 
a written test using a multiple-choice format with eight questions 
(Table 1). This test was designed to evaluate how much factual 
knowledge was retained from the training (Kang and Jain, 2011). 
The knowledge interpretation test evaluated whether the whole 
procedure of the assembly was properly captured. This test was 
executed on a scaled physical jig, and the trainee was asked to perform, 
step by step, significant parts of the assembly training until 
completing the whole operation. Each time the participant 
skipped a step or required intervention from the experimenter 
(e.g., one of the drills was not performed), one point was deducted 
from the score. The maximum score was 43, equivalent to the 
number of actions required to complete the operation. 
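
The scoring rule of the knowledge interpretation test can be sketched as follows. This is only an illustration under one assumption that the text does not spell out: each of the 43 required actions is recorded as either completed unaided or not (skipped, or completed only after intervention), so that each failed action costs exactly one point. The function name and the example outcomes are hypothetical.

```python
TOTAL_ACTIONS = 43  # maximum score stated in the text


def interpretation_score(step_outcomes):
    """Count the actions completed without skipping or intervention.

    step_outcomes: one boolean per required action, True when the
    trainee performed the action unaided.
    """
    if len(step_outcomes) != TOTAL_ACTIONS:
        raise ValueError("expected one outcome per required action")
    return sum(1 for completed in step_outcomes if completed)


if __name__ == "__main__":
    # Hypothetical trainee who skipped one action and needed help on another.
    outcomes = [True] * TOTAL_ACTIONS
    outcomes[5] = False
    outcomes[20] = False
    print(interpretation_score(outcomes))
```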


RESULTS 


Knowledge Capture 

No significant differences were found for knowledge retention 
(scores from 0 to 8) between the two conditions [Kruskal-Wallis 
rank sum test χ²(1) = 0.1, p = 0.7]. The score for the MR condition 
was (M = 3.75, SD = 1.21), and the score for the conventional 
condition was (M = 3.91, SD = 1.44). The two training methods 
therefore did not provide significantly different levels of factual 
knowledge, although retention was not very high in either: the maximal 
score was 8, and the mean scores in both conditions were below half of that 
(Figure 3). 

No significant differences were found for knowledge interpretation 
(scores from 0 to 43) between the two conditions [Kruskal-Wallis 
rank sum test χ²(1) = 1.9, p = 0.16]. The score for the MR 
condition was (M = 35.41, SD = 8.03), and the score for the 
conventional face-to-face condition was (M = 39.25, SD = 4.86). 
Given the high score for both conditions, the procedural training 
can be considered successful (Figure 3). 
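
For readers unfamiliar with the Kruskal-Wallis rank sum test used above, the following is a minimal pure-Python sketch of the two-group case. The paper does not state which software was used, so this is not the authors' analysis code; the function names are hypothetical and the scores in the usage example are invented for illustration. With two groups the statistic has one degree of freedom, so the chi-square tail probability can be computed with `math.erfc`.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for two independent samples, with tie correction."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    for start, size in ((0, len(a)), (len(a), len(b))):
        rank_sum = sum(ranks[start:start + size])
        weighted += rank_sum ** 2 / size
    h = 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    correction = 1.0 - ties / float(n ** 3 - n)
    if correction == 0.0:
        return 0.0, 1.0  # all observations identical
    h = max(h / correction, 0.0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(h / 2.0))
    return h, p


if __name__ == "__main__":
    # Invented retention scores (0 to 8), not the study data.
    mr_scores = [3, 4, 5, 2, 4, 3]
    conventional_scores = [4, 5, 3, 4, 6, 2]
    print(kruskal_wallis_two_groups(mr_scores, conventional_scores))
```

The p-value relies on the chi-square approximation, which is coarse for small groups; an exact or permutation-based p-value would be a reasonable alternative at these sample sizes.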

We ran an additional Two One-Sided Test (TOST) for 
equivalence and found that for the knowledge retention both 
populations showed a confidence level over 93%, indicating a 
trend in equivalence for the retention between the MR and 
the conventional face-to-face conditions. The same test on the 
knowledge interpretation did not show such a high equivalence 
and was rejected (p = 0.84); therefore, the knowledge 
interpretation results were not conclusive: although they 
were not significantly different, they were also not significantly 
equivalent. 


TABLE 1 | Questions of the knowledge retention test. 

Knowledge retention questions: 

1. How would you know what personal protective equipment (PPE) you will need? 
   a) Look up the AIPI list. 
   b) Look up the Airbus instruction protective equipment list. 
   c) Look up the WI Bill of PPE. 
   d) Ask your team leader or a qualified technician. 

2. What PPE do you need to wear? 
   a) No PPE is required. 
   b) Overalls and safety boots. 
   c) Overalls, safety boots, safety glasses, general gloves, general masks. 
   d) Overalls, safety boots, safety glasses, chemical gloves, chemical masks. 

3. What do you need to ensure during the cleaning of the jig operation? 
   a) Ensure all the pin bolts fit into the bushes correctly. 
   b) Ensure all the parts are cleaned to a good standard. 
   c) Ensure all the parts are moving and free from interference. 
   d) Ensure all the parts are cleaned to CPC and are free from interference. 

4. To prepare the jig to receive the door panel, what do you need to disassemble? 
   a) The pin bolts and the support plate. 
   b) The pin bolts and the drilling templates. 
   c) The pin bolts and the hinges. 
   d) There is nothing to disassemble; the door panel is fitted directly onto the jig. 

5. How many pin bolts are needed to secure the door panel to the jig? 
   a) Two 
   b) Four 
   c) Six 
   d) Eight 

6. How many pin bolts are needed to secure the support plate to the jig? 
   a) Two 
   b) Four 
   c) Six 
   d) Eight 

7. What do you have to do before fitting the drilling templates? 
   a) Check that the lock on the jig hinge is free from any interference. 
   b) Inspect the hinge according to the AIPI before rotating. 
   c) Rotate the jig with the drilling templates on the upside. 
   d) Rotate the jig with the drilling templates on the downside. 

8. How do you fit the drilling templates to the jig? 
   a) Claw clamped through the bushes according to the AIPI. 
   b) Fit the template to the jig and pin with pin bolts. 
   c) Manually clamped with G clamps. 
   d) Pinned with pin bolts according to the AIPI. 

The test had a multiple-choice format. 

When studying the relation between both kinds of knowledge 
capture, we found that while in the MR condition a correlation 
trend was found between high scores in the interpretation and 
retention tests [Pearson r(12) = 0.57, p = 0.052], this was not true 
for the conventional face-to-face training condition (p > 0.39) 
(Figure 4). Moreover, it seems that top performing participants 
in the MR condition were as good as the ones in the 
conventional training. However, low performing participants 
in the MR condition were worse. We hypothesize that low performers 
may have been overwhelmed by the setup, which constrained 
their capacity to capture knowledge; however, this effect may 
fade away as participants become more used to the technology 
itself. 
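
As a reminder of how a Pearson correlation between two sets of scores is obtained, the following minimal sketch computes the coefficient r and the t value used to test it against zero (with n - 2 degrees of freedom). It is not the authors' analysis code; the function names are hypothetical and the paired scores in the usage example are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length samples.

    Assumes neither sample is constant (non-zero variance).
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two samples of equal length with n >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t value for testing r against zero with n - 2 degrees of freedom.

    Assumes |r| < 1.
    """
    return r * math.sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    # Invented paired scores for six trainees, not the study data.
    retention = [2, 3, 3, 4, 5, 6]
    interpretation = [25, 30, 33, 36, 40, 42]
    r = pearson_r(retention, interpretation)
    print(r, t_statistic(r, len(retention)))
```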


Time 

The time spent to complete the training was significantly higher 
in the MR condition (M = 12.1, SD = 2.5 min) than in the conventional 
face-to-face training condition (M = 9.9, SD = 0.9 min) 
[Kruskal-Wallis rank sum test χ²(1) = 0.64, p = 0.01] (Figure 4). 
This could be partially due to the extra time some participants 
took to familiarize themselves with the interaction metaphors and the novelty 
of the MR setup. 


DISCUSSION 


Overall, we found that the knowledge levels acquired both in the 
mixed reality setup and in the conventional face-to-face setup 
were not significantly different. Very high scores were found 
in the interpretation test in both conditions: participants scored over 80% 
accuracy after a single training session in a manufacturing 
operation that was totally novel to them, even though the training 
process was complex enough that it was not feasible to 
complete the tests successfully without previous training. These 
results validate our training methodology, which was a practical 
example of a complex aircraft door manufacturing procedure. 
However, equivalence results failed to show significance between 
participants in the MR and the conventional face-to-face condi- 
tions. A trend was found with 93% confidence of equivalence on 
the retention results obtained by participants of the conventional 
face-to-face training when compared to the MR, which shows 
that MR scenarios can potentially provide a successful metaphor 
for collaborative training. In general, the scores in the retention 
test were low. We hypothesize that there might be two reasons 
for the difference in performance between the retention and the 
interpretation knowledge. First, the complexity of the task might 
require several training sessions to be properly retained. Second, 
we believe that, given the type of training, the participants 
developed a more hands-on memory of the procedure than 
an abstract knowledge. Indeed, many participants were able to 
remember the number of bolts involved in an operation if the jig 
was presented in front of them, but could not recall the number 
of bolts involved when asked in a written test. We did, however, 
find a correlation between high interpretation and retention 
scores in participants who completed the training through MR; 
such a correlation was not found with the conventional face-to-face 
training results. The correlation shows that participants 
who were better in the interpretation task were also better in 
the retention task, while participants who performed poorly 
were bad in both types of tests. These results are aligned with 
previous studies that show higher cognitive load is needed when 
using novel technologies at first (Chen et al., 2007), and the MR 
setup might have placed some participants outside their comfort 
zone, making them unable to remember or guess what to do 
next. This would also contribute toward explaining the results 
that show that participants took longer in the MR condition than 
in the conventional face-to-face condition, because they were 
less familiar with the environment. Nevertheless, the actual 
post-training knowledge scores were not significantly different 
between participants of the MR condition and the physical 
one, thus evidencing the great possibilities in the use of MR for 
complex manufacturing training. We hypothesize that these 
positive results are closely linked to the theories of first person 
interaction with digital objects (Spanlang et al., 2014). 
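
The accuracy figures discussed above follow directly from the mean scores and maximum scores reported in the Results. The short sketch below reproduces that arithmetic; the means and maxima are taken from the text, while the function and variable names are hypothetical.

```python
def percent_of_max(mean_score, max_score):
    """Express a mean score as a percentage of the maximum attainable score."""
    return 100.0 * mean_score / max_score


# (mean score, maximum score) as reported in the Results section.
reported = {
    "interpretation, MR": (35.41, 43),
    "interpretation, conventional": (39.25, 43),
    "retention, MR": (3.75, 8),
    "retention, conventional": (3.91, 8),
}

if __name__ == "__main__":
    for label, (mean_score, max_score) in reported.items():
        print(f"{label}: {percent_of_max(mean_score, max_score):.2f}%")
```

Both interpretation means sit above 80% of the maximum, whereas both retention means sit below 50%, which is the contrast the discussion sets out to explain.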


CONCLUSION 


The current paper has presented and validated the use of mixed 
reality metaphors for complex manufacturing training by running 
a user study and measuring the post-training knowledge 
retention and interpretation scores. The results show trends 
of equivalent knowledge retention between MR training and 
the conventional face-to-face training. However, no significant 
differences or significant equivalences were found between the 
two conditions for knowledge interpretation. These results support 
the idea that MR setups can achieve high performance 
in the context of collaborative training. The implementation 
of this technology in the industry could have several benefits: 
this form of training does not require the physical equipment 
to be present, which could reduce the costs of training and also 
eliminate safety issues and operational hazards. However, 
this setup would not be a complete substitute for face-to-face 
training, since there would still be a need for professional trainers. 
Therefore, only one part of the overhead training costs would 
be reduced. The implications of these results are relevant not 
only for the manufacturing industry but also for the MR and AR 
community, as they show evidence of how the integration of 
existing metaphors for collaborative work can be implemented 
in immersive MR. 